Fully Bayesian Unsupervised Disease Progression Modeling

Author

  • Arya Pourzanjani
Abstract

We present a practical implementation of a fully unsupervised disease progression model [10]. The implementation is built entirely from new components that we developed for generic use in Bayesian disease progression modeling. It improves upon [10] by providing a more informative, fully Bayesian approach and a faster inference algorithm. The implementation is built completely on the pyMC3 open-source library, making it easy to extend the model and apply it to new settings.

1 Disease Progression Models

Traditionally, disease severity and progression have been assessed manually by physicians using guidelines such as the GOLD criteria for COPD [6]. These guidelines are typically based on rules applied to the patient’s biomarkers, demographics, and other data easily extracted from health records. The sub-area of machine learning called disease progression modeling (DPM) focuses on automating this process [5]. Automation leads to more accurate diagnoses and better treatment paths, which can be the difference between life and death, as in the case of coagulopathy patients [9]. More broadly, we expect that algorithms that learn disease progression models from electronic health records will lead to new insights into the progression of rare and difficult-to-stage chronic diseases, guiding both clinical practice and medical research.

2 Bayesian Models and pyMC3

Bayesian networks provide a natural framework for modeling disease progression. They allow for the flexible modeling of “hidden states”, which often arise in medical scenarios where measurements are simply proxies for variables of interest. Furthermore, Bayesian posteriors provide a full description of the parameters of interest, as opposed to point estimates and simple confidence intervals. Several examples of Bayesian network models for disease progression exist in the literature [1, 2, 4, 7, 10].

pyMC3 is a Python module that provides a unified and comprehensive framework for fitting Bayesian models using MCMC [8]. pyMC3’s key strength is its modularity and extensibility: ran...

∗ Research performed while interning at Evidation Health
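The "hidden state" idea described above can be illustrated with a minimal sketch. This is not the model of [10] and does not use pyMC3; it is a standard-library forward filter for a small discrete hidden Markov model, assuming three hypothetical disease stages and a binary biomarker reading. The function name `forward_filter` and every probability below are assumptions chosen only for illustration.

```python
def forward_filter(prior, transition, emission, observations):
    """Return the posterior over hidden stages after each observation.

    prior[i]         : P(stage_0 = i)
    transition[i][j] : P(stage_t = j | stage_{t-1} = i)
    emission[i][o]   : P(observation = o | stage = i)
    """
    n = len(prior)
    belief = list(prior)
    history = []
    for t, obs in enumerate(observations):
        if t > 0:
            # Predict step: propagate the belief through the transition model.
            belief = [
                sum(belief[i] * transition[i][j] for i in range(n))
                for j in range(n)
            ]
        # Update step: weight each stage by the likelihood of the observation.
        unnormalized = [belief[j] * emission[j][obs] for j in range(n)]
        total = sum(unnormalized)
        belief = [u / total for u in unnormalized]
        history.append(belief)
    return history


# Hypothetical numbers: three stages (0 = mild, 1 = moderate, 2 = severe).
prior = [0.80, 0.15, 0.05]
# Progression only moves forward, so the matrix is upper triangular.
transition = [
    [0.90, 0.10, 0.00],
    [0.00, 0.85, 0.15],
    [0.00, 0.00, 1.00],
]
# Biomarker reading: 0 = normal, 1 = abnormal; a noisy proxy for the stage.
emission = [
    [0.9, 0.1],
    [0.5, 0.5],
    [0.1, 0.9],
]
observations = [0, 1, 1, 1]

posteriors = forward_filter(prior, transition, emission, observations)
for t, belief in enumerate(posteriors):
    print(t, [round(p, 3) for p in belief])
```

Each row printed is a full distribution over the hidden stage rather than a single best guess, which is the property the text attributes to Bayesian posteriors. A fully Bayesian treatment would also place priors on the transition and emission probabilities and infer them, for example with MCMC.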


Similar resources

Bayesian Unsupervised Word Segmentation with Nested Pitman-Yor Language Modeling

In this paper, we propose a new Bayesian model for fully unsupervised word segmentation and an efficient blocked Gibbs sampler combined with dynamic programming for inference. Our model is a nested hierarchical Pitman-Yor language model, where Pitman-Yor spelling model is embedded in the word model. We confirmed that it significantly outperforms previous reported results in both phonetic transc...


Bayesian approaches in Natural Language Processing

This paper overviews Bayesian approaches in natural language processing that are becoming prominent. Without any knowledge of natural language processing, Bayesian approaches to both discriminative learning and generative modeling are described. Especially, naïve Bayes and its full unsupervised Bayesian modeling, DM, and LDA are developed. These Bayesian approaches permit interesting joint mode...


Weakly Supervised Part-of-Speech Tagging for Morphologically-Rich, Resource-Scarce Languages

This paper examines unsupervised approaches to part-of-speech (POS) tagging for morphologically-rich, resource-scarce languages, with an emphasis on Goldwater and Griffiths’s (2007) fully-Bayesian approach originally developed for English POS tagging. We argue that existing unsupervised POS taggers unrealistically assume as input a perfect POS lexicon, and consequently, we propose a weakly supe...


Bayesian Nonparametric Collaborative Topic Poisson Factorization for Electronic Health Records-Based Phenotyping

Phenotyping with electronic health records (EHR) has received much attention in recent years because the phenotyping opens a new way to discover clinically meaningful insights, such as disease progression and disease subtypes without human supervisions. In spite of its potential benefits, the complex nature of EHR often requires more sophisticated methodologies compared with traditional methods...


Non-stationary Clustering Bayesian Networks for glaucoma

Glaucoma is a major cause of blindness and its mechanisms are not fully understood. The progression of the disease can be slowed by early diagnosis, but this is a difficult task because available data is typically noisy and has high variability. Several artificial intelligence approaches have been used in this context, although they generally don’t exploit the temporal nature of the data. Here ...



Publication date: 2015